AD-A210  605 



ARI Research Note 89-22


Effects  of  Stress  on  Judgment  and  Decision 
Making  in  Dynamic  Tasks 


Kenneth  R.  Hammond 
University of Colorado

for 

Contracting  Officer’s  Representative 
Judith  Orasanu 


Office  of  Basic  Research 
Michael  Kaplan,  Director 


May  1989 


United  States  Army 

Research  Institute  for  the  Behavioral  and  Social  Sciences 


Approved  for  the  public  release;  distribution  is  unlimited. 




U.S.  ARMY  RESEARCH  INSTITUTE 

FOR  THE  BEHAVIORAL  AND  SOCIAL  SCIENCES 


A  Field  Operating  Agency  Under  the  Jurisdiction 
of  the  Deputy  Chief  of  Staff  for  Personnel 


EDGAR M. JOHNSON, Technical Director
JON W. BLADES, COL, IN, Commanding


Research  accomplished  under  contract 
for  the  Department  of  the  Army 

University  of  Colorado 


Technical  review  by 
Judith  Orasanu 


NOTICES 

DISTRIBUTION: This report has been cleared for release to the Defense Technical Information
Center (DTIC) to comply with regulatory requirements. It has been given no primary distribution
other than to DTIC and will be available only through DTIC or the National Technical Information
Service (NTIS).

FINAL  DISPOSITION:  This  report  may  be  destroyed  when  it  is  no  longer  needed.  Please  do  not 
return  it  to  the  U.S.  Army  Research  Institute  for  the  Behavioral  and  Social  Sciences. 

NOTE: The views, opinions, and findings in this report are those of the author(s) and should not
be construed as an official Department of the Army position, policy, or decision, unless so
designated by other authorized documents.


UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE

REPORT DOCUMENTATION PAGE
Form Approved, OMB No. 0704-0188

1a. REPORT SECURITY CLASSIFICATION: Unclassified
1b. RESTRICTIVE MARKINGS:
2a. SECURITY CLASSIFICATION AUTHORITY:
2b. DECLASSIFICATION/DOWNGRADING SCHEDULE:
3. DISTRIBUTION/AVAILABILITY OF REPORT: Approved for public release;
   distribution is unlimited.
4. PERFORMING ORGANIZATION REPORT NUMBER(S):
5. MONITORING ORGANIZATION REPORT NUMBER(S): ARI Research Note 89-22
6a. NAME OF PERFORMING ORGANIZATION: University of Colorado
6b. OFFICE SYMBOL (If applicable):
6c. ADDRESS (City, State, and ZIP Code):
    Center for Research on Judgment and Policy
    Campus Box 344 - University of Colorado
    Boulder, CO 80309-0344
7a. NAME OF MONITORING ORGANIZATION: U.S. Army Research Institute,
    Office of Basic Research
7b. ADDRESS (City, State, and ZIP Code):
    5001 Eisenhower Avenue
    Alexandria, VA 22333-5600
8a. NAME OF FUNDING/SPONSORING ORGANIZATION: U.S. Army Research
    Institute for the Behavioral and Social Sciences
8b. OFFICE SYMBOL (If applicable): PERI-BR
8c. ADDRESS (City, State, and ZIP Code):
    5001 Eisenhower Avenue
    Alexandria, VA 22333-5600
9. PROCUREMENT INSTRUMENT IDENTIFICATION NUMBER: MDA903-86-C-0142
10. SOURCE OF FUNDING NUMBERS:
    PROGRAM ELEMENT NO.: 61102B
    PROJECT NO.: 74F
    TASK NO.:
    WORK UNIT:
11. TITLE (Include Security Classification):
    Effects of Stress on Judgment and Decision Making in Dynamic Tasks
12. PERSONAL AUTHOR(S): Hammond, Kenneth R. (University of Colorado)
13a. TYPE OF REPORT: Interim
14. DATE OF REPORT (Year, Month, Day): 1989, May
15. PAGE COUNT:
17. COSATI CODES (GROUP, SUB-GROUP):
18. SUBJECT TERMS (Continue on reverse if necessary and identify by block number):
    Dynamic tasks; Uncertainty; Experts; Judgment
19. ABSTRACT (Continue on reverse if necessary and identify by block number):

    Studies of expert microburst forecasters were conducted. Two studies
    yielded results confirming the validity of a linear model of expert
    judgment and the meaningfulness of profiles as representations of
    weather phenomena. A simulation demonstrated that a simple
    scientifically and empirically ignorant forecasting model could perform
    as well as a sophisticated scientifically informed algorithm. A study
    conducted under dynamic and highly representative forecasting
    conditions yielded the following major findings: (a) agreement on
    precursor values was low to moderate, setting the possible upper limit
    on forecasting accuracy; (b) agreement on microburst probabilities was
    lower under the highly representative situation than in our "best case
    scenario" study; and (c) new information received over time had very
    little impact on the experts' judgments.

20. DISTRIBUTION/AVAILABILITY OF ABSTRACT:
    [X] UNCLASSIFIED/UNLIMITED  [ ] SAME AS RPT.  [ ] DTIC USERS
21. ABSTRACT SECURITY CLASSIFICATION: Unclassified
22a. NAME OF RESPONSIBLE INDIVIDUAL: Judith Orasanu
22b. TELEPHONE (Include Area Code): (202) 274-8722
22c. OFFICE SYMBOL: PERI-BR

DD Form 1473, JUN 86. Previous editions are obsolete.


EFFECTS  OF  STRESS  ON  JUDGMENT  AND  DECISION  MAKING  IN  DYNAMIC  TASKS 


Goals 

The  principal  goals  of  this  research  project  are  to  (a)  discover  the 
nature  of  judgment  and  decision  making  in  dynamic  tasks  and  (b)  study  the 
effects  of  stress  on  judgment  and  decision  making  in  such  tasks.  Neither 
project  has  been  undertaken  previously  by  researchers  in  this  field. 

The research carried out is best summarized in the report of May 1988
in which the results of six studies of expert decision making in dynamic
tasks are described. The six studies are cumulative, that is, each study is
a logical consequence of its predecessor; the final study (VI) is of
greatest importance because it involves experts making judgments in a
dynamic task situation that is highly representative of their working
conditions. The results thus obtained carry an authenticity not ordinarily
available. A report describing Study VI (Appendix A) and a summary of the
studies preceding it together with research recommendations (Appendix B)
are appended to this report.

Since May 1988 we have made further analyses of the probability
judgments of the experts regarding microburst events. This work was described
in  the  eighth  Quarterly  Report  (see  Appendix  C).  The  most  significant 
findings  were  that  (a)  new  information  received  over  time  had  very  little 
impact  on  the  experts'  judgments  and  (b)  the  experts  were  very  poorly 
coordinated  with  one  another. 

Thus we found a group of five experts working together on an important
problem for several years, yet never comparing their performances. We also
found that well-known psychological research procedures produced information
heretofore unknown, and unsought, that was fundamental to that research
effort (detecting and forecasting microbursts). Our research on hail
forecasting produced similar results. In short, the most significant results
of our work are that (a) important information regarding expert judgment in
dynamic tasks can be produced rapidly with standard research techniques,
(b) such information will not be produced unless and until such techniques
are applied, and (c) it is unlikely that they will be applied in the
future unless psychologists suggest them. To what extent these conclusions
apply to experts working in other domains is uncertain.

Our theoretical work has focused on the distinction between
pattern-matching processes and processes involving functional relations. Hammond
worked  out  a  general  theoretical  framework  that  encompasses  these  two  major 
concepts  and  shows  how  they  are  related  to  one  another.  This  work  is 
reported  by  K.  R.  Hammond,  Judgment  and  decision  making  in  dynamic  tasks, 
soon  to  be  published  in  Large  Scale  Systems. 




Reports 


Hammond, K. R. (in press). Information models for intuitive and analytical
cognition. In A. Sage (Ed.), Concise encyclopedia of information
processing in systems and organizations. Oxford: Pergamon Press.

Hammond,  K.  R.  (in  press).  Judgment  and  decision  making  in  dynamic  tasks. 
Large  Scale  Systems. 

Lusk, C. M., Stewart, T. R., & Hammond, K. R. (1988). Judgment and decision
making in dynamic tasks: The case of forecasting the microburst
(Technical Report No. 284). Boulder: University of Colorado, Center
for Research on Judgment and Policy.

Lusk, C. M., Stewart, T. R., & Hammond, K. R. (1988). Toward the study of
judgment and decision making in dynamic tasks: The case of forecasting
the microburst (Technical Report No. 278). Boulder: University of
Colorado, Center for Research on Judgment and Policy.

Stewart, T. R., Moninger, W. R., Grassia, J., Brady, R. H., & Merrem, F. H.
(1988). Analysis of expert judgment and skill in a hail forecasting
experiment (Technical Report No. ilZ). Boulder: University of
Colorado, Center for Research on Judgment and Policy.


Forthcoming  Presentation 

Potts, R., Lusk, C., Hammond, K., & Stewart, T. (1989). Expert judgment in
the  nowcasting  of  microbursts.  Paper  to  be  presented  at  the  Third 
Annual  International  Conference  on  the  Aviation  Weather  Systems, 
Anaheim,  CA. 


Appendix A

Study VI:
A Laboratory Experiment: Precursor Assessment by Forecasters

Cynthia  M.  Lusk 
Kenneth  R.  Hammond 


May  1988 


Center  for  Research  on  Judgment  and  Policy 
Institute  of  Cognitive  Science 
University  of  Colorado,  Boulder 




Precursor  Assessment 
Lusk, Hammond


The purpose of this report is to present the preliminary results of an
experiment conducted to study the cognitive aspects of the nowcasting of
microbursts. In particular, the report focuses on analyses of those data
most pertinent to NCAR's immediate needs. In a previous report (Lusk,
Stewart, and Hammond, 1988), we outlined a hierarchical model
depicting the steps between the storm environment and a judgment about
microbursts, which is presented in Figure 1. The links between each phase
in Figure 1 represent points at which forecasters' judgment processes are
involved. One of our previous studies (Study III) had indicated some
degree of disagreement among the forecasters regarding extraction of the
precursor values from drawings of radar data and clouds. The present study
was conducted to clarify those findings; here we use a situation more
representative of that in which the forecasters normally operate.

Procedure 


The subjects in this experiment were four of the five microburst
forecasters who participated in our previous studies.

The experiment was conducted in two phases. We began the first phase
according to the procedure outlined in our 2 March 1988 research proposal
and detailed below. After completion of two cases (one microburst and one
null case), the study was halted for a preliminary assessment of the
procedure and results. The agreement among the forecasters was found to be
so low that a meeting was held to discuss whether further documentation of
the conclusion would be cost beneficial. It was decided that further data
would be worth acquiring, and the experimental procedure was modified to
collect those data. The procedures are detailed below.

Overview 


During each experimental session, the forecaster was seated in front
of a large computer terminal used to present color Doppler radar displays.
The experimenter was seated in front of a computer terminal that was used
to run the experimental session. At the first session of each phase of the
experiment, the forecasters were presented with instructions regarding how
the experiment would proceed. The forecasters were presented with a volume
of radar data, after which they made judgments of precursor values and the
probability of a microburst. The presentation of data and making of
judgments was repeated until completion of each case.

The  Cases 

Six cases were used to generate the data in this report: two in the
first phase and four in the second phase. Half of the cases in each phase
were null cases and half were microburst cases.




Each case was arranged on a tape in consecutive volumes. Each volume
consisted of 13 scans, starting with either the .5 or 1.1 scan and
terminating with either the 34.8 or the 39.9 scan. In the first phase,
Case 1 included six volumes. The data for Case 2 spanned eight volumes;
however, one volume was skipped due to faulty data. In addition, one
volume in Case 2 included only the lower seven scans, but judgments
were still collected for that short volume. In the second phase all cases
included four volumes of data. Each case terminated before the microburst
was evident on the lowest scan or, in the null cases, before any obvious
or substantial decrease in the intensity or height of the cell.

The  Judgments 

The forecasters were asked to make judgments of the six precursor
values they had indicated to be the cues in Study I: descending core,
collapsing storm, convergence/divergence above cloud base,
convergence/divergence at or below cloud base, notch, and rotation. In
addition, forecasters made judgments of the probability of a microburst
occurring in the next 5 to 10 minutes.

The judgments regarding precursor values and probability of a
microburst were made on the same scales as used in our previous
studies. In addition, to the right of each rating scale was a blank for
the forecasters to insert their confidence in their precursor judgments.
The rating sheet is shown in Figure 2.

In the first phase, judgments were made after each volume. Therefore,
judgments were made six times for Case 1 and seven times for Case 2. In
the second phase judgments were made after all but the first volume. Thus,
three judgments were made for each of the four cases in the second phase.
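
The judgment counts described above can be tallied in a short sketch. The variable names are illustrative only; the counts themselves are taken from the text, and the total matches the 25 data points per subject cited under Rating Sheets.

```python
# Judgment occasions per case, as described in the text.
# Case 2 spanned eight volumes, but one was skipped, leaving seven.
phase_one = {"Case 1": 6, "Case 2": 7}

# Second phase: four cases of four volumes each, judged after every
# volume except the first.
phase_two_cases = 4
volumes_per_case = 4
judged_volumes = volumes_per_case - 1

phase_two_total = phase_two_cases * judged_volumes
total = sum(phase_one.values()) + phase_two_total

print("Phase one judgments:", sum(phase_one.values()))
print("Phase two judgments:", phase_two_total)
print("Total judgment occasions per forecaster:", total)
```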

The  Experimental  Session 

At the beginning of the first session in each phase, the forecasters
were provided with instructions to read. For the first phase the
instructions explained how the experimental sessions would proceed. They
explained that each case consisted of several volume scans over time of a
cell that did or did not produce a microburst, starting with the lowest
scan at the earliest time. When they were finished observing each scan,
the forecasters were instructed to tell the experimenter that they were
ready for the next level scan. The forecasters were given up to thirty
seconds to view each scan. After completion of a volume in this manner,
the forecasters filled in the rating sheet. In addition, the instructions
stated, in part:

At the time of the first volume you can assume that a
microburst is not presently occurring. Please assume before
observing the first scan, that on the basis of prior information
(morning soundings, etc.) you have already reached the conclusion
that the likelihood of a microburst on this day is .50. Then
adjust your probabilities of a microburst after observing the
radar data. Each case will terminate prior to evidence of
outflow from a microburst or evidence that the storm is obviously
dissipating.

Finally, the forecasters were given instructions to think aloud.

The instructions for the second phase explained the changes in the
experimental procedure. The forecasters were informed that they would
receive the noon sounding data, view only four volumes of data, and make
judgments as in the first phase after the second through fourth volumes.
In addition, the instructions explained that the scans would be presented
continuously and that they would not need to think aloud.

The  forecasters  were  provided  with  blank  paper  for  taking  notes  and 
felt  tip  pens  to  mark  the  screen.  The  date  for  each  case  was  masked  on  the 
computer  screen.  At  the  beginning  of  each  case,  the  forecasters  were  told 
the  coordinates  where  the  cell  they  were  to  attend  to  was  presently 
located. 

Prior to presentation of each case in the second phase, the
forecasters were presented with the eleven o'clock sounding for the day
from which that case was drawn. The subjects were then asked what the
probability of a microburst occurring was, based on the sounding
information alone.

In the first phase, half of the forecasters were presented with Case 1
first, and half were presented with Case 2 first. In the second phase, the
cases were arranged on a tape in a fixed order. Each forecaster began with
a different case, but otherwise the order of presentation was fixed.

Results  and  Discussion 

Verbal Protocols

Examples of the verbalizations are provided in the Appendix. Although
no formal analyses have been completed on the verbal protocols collected
during the experimental sessions, informal inspection indicates that during
observation of the radar data the forecasters were primarily operating at
Phases D and E in our hierarchical judgment model (see Figure 1). That is,
the verbalizations primarily concern translating the radar data to
information such as the maximum reflectivity values, convergence or
divergence, and noting the occurrence of features such as a notch at each
level scan. The dynamic nature of the task was evident in the
verbalizations when the forecasters made comparisons of reflectivity or
velocity features between different levels or times. Such comparisons may
also indicate forecasters' integration of maximum reflectivity values over
time and height into judgments of cue values such as descending core.




Preliminary review of the verbalizations made at the time of judgments
yields little insight into the judgment process. For the most part, the
forecasters provide a dichotomous yes or no value regarding the occurrence
of each precursor, then decide exactly what value on the scale to circle.
The cognitive process for making the probability of a microburst judgment
was not evident. Apparently this takes place on an intuitive level. No
calculations or applications of a principle or formula for organizing the
information are evident in the protocols. This result makes plausible the
hypothesis that forecasters combine information in a linear additive
fashion.
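
As a rough illustration of what a linear additive combination of the six precursor ratings would look like, the following sketch computes a weighted sum and clips it to a probability. It is not the model fitted in the earlier studies: the weights, the intercept, the rating values, and the function name are all hypothetical and chosen only to show the form of the rule.

```python
# The six precursors named in the report.
PRECURSORS = (
    "descending core",
    "collapsing storm",
    "convergence/divergence above cloud base",
    "convergence/divergence at or below cloud base",
    "notch",
    "rotation",
)


def linear_additive_probability(cues, weights, intercept=0.0):
    """Weighted sum of cue ratings, clipped to the range [0, 1]."""
    score = intercept + sum(weights[name] * cues[name] for name in PRECURSORS)
    return min(1.0, max(0.0, score))


# Hypothetical weights and ratings, for illustration only.
weights = {name: 0.01 for name in PRECURSORS}
weights["descending core"] = 0.04  # assumed here to carry the most weight
cues = {name: 5 for name in PRECURSORS}

print(linear_additive_probability(cues, weights))
```

Under such a rule each precursor contributes independently of the others, which is why disagreement about a heavily weighted cue would be expected to carry through to the probability judgment.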

Rating  Sheets 

The only analyses conducted to date concern the agreement between
forecasters' judgments of precursor values and agreement between
forecasters' judgments of the probability of a microburst. The data used
in these analyses were the judgments made after each volume. Thus, 25 data
points are possible for each subject (some volumes have a slightly lower
number of data points in instances where forecasters did not provide
ratings). The correlations between the judgments of each pair of
forecasters were computed for each precursor and are presented in Tables 1
through 6. Similarly, the correlations between judgments of the
probability of a microburst were computed and are presented in Table 7.
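
The pairwise agreement analysis described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the ratings are invented, the function names are hypothetical, and dropping a judgment occasion when either forecaster gave no rating (pairwise deletion) is an assumption, since the report does not say how missing ratings were handled.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson r over occasions where both forecasters gave a rating."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return None
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None  # correlation is undefined for a constant series
    return sxy / sqrt(sxx * syy)


def agreement_table(ratings):
    """Map each pair of forecaster ids to the correlation of their ratings."""
    return {
        (i, j): pearson(ratings[i], ratings[j])
        for i, j in combinations(sorted(ratings), 2)
    }


# Hypothetical ratings of one precursor by four forecasters.
# None marks an occasion on which no rating was provided.
ratings = {
    1: [7, 5, 3, 8, 6],
    2: [6, 5, 4, 9, None],
    3: [2, 6, 7, 3, 5],
    4: [7, 4, 3, 8, 6],
}

table = agreement_table(ratings)
for pair, r in table.items():
    print(pair, None if r is None else round(r, 2))
```

With four forecasters there are six pairs, so each of Tables 1 through 7 would hold six correlations.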

Tables 1 through 7 clearly indicate a lack of agreement between
forecasters regarding both the precursor and probability judgments.
Although many of the correlations are substantially larger than zero (and
are, in fact, statistically significant), they are all substantially
smaller than one or any other level of acceptable agreement.

Comparison of Tables 1 through 6 indicates a higher degree of
agreement on some precursors than on others. Particularly noteworthy are
the low and even negative (!) correlations for judgments of descending
core. This result is particularly important because this precursor is the
one which forecasters weighted most heavily in arriving at microburst
probability judgments (as indicated in Study I). Our previous Study III
also indicated some disagreement among forecasters, but not to the degree
indicated in the present study. The present study is a much better
indicator of the degree of disagreement given its representative design.
Thus the representative conditions produced lower rather than higher
agreement, in opposition to expectations. The higher agreement in Study
III may be due in part to the fact that those judgments were made with
clearly delineated schematic cloud and radar drawings, rather than actual
radar data.

Agreement regarding precursor values was highest for the two
convergence precursors, second highest for collapsing storm and notch, and
lowest for rotation and descending core. Of course, future research will
need to address how agreement can be improved. It is possible that the
different levels of agreement between precursors may be due to the
different levels of abstraction or stages necessary to make judgments of
the precursor values. For example, convergence is perhaps the precursor
value most directly obtained (from the radar velocities). In contrast, the
descending core judgment requires that the forecaster combine information
about maximum reflectivity values over times and heights.

Note that there is one very high correlation regarding the probability
of a microburst in Table 7: that between forecasters 1 and 4 of .88, a
result that raises some interesting issues. First, note that although
these two forecasters are in high agreement regarding the probability of a
microburst, their agreement regarding the value for descending core is
essentially zero (Table 1). In Study I, both of these forecasters gave the
highest weight to descending core among the six precursors. In the present
study they show no agreement regarding that cue value, yet they show high
agreement regarding the probability of a microburst. Such a finding is
puzzling and deserves a great deal of consideration. Possibly the
forecasters are utilizing some other information in arriving at their
microburst probability judgment. Second, other statistical analyses may
yield insight into the discrepancy. For example, analyses run separately
on the null and microburst cases show that the high correlation for
probability of a microburst judgment is to a large extent due to agreement
on the microburst cases (r = .95), rather than agreement on the null cases
(r = .47). However, a striking result is that for both the microburst and
null cases, the correlations between the forecasters' judgments of
descending core are negative (r = -.38 for microburst cases, r = -.18 for
null cases). Similar comparisons may be made for other forecasters and
judgments.
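
The separate analysis of null and microburst cases described above can be sketched as follows. The probabilities are invented and the function names are hypothetical; the sketch only shows how a correlation is computed within each case type alongside the pooled value, so that the contribution of each type to overall agreement can be inspected.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_correlations(a, b, is_microburst):
    """Return the pooled r and the r within microburst and null cases."""
    def within(flag):
        xs = [p for p, m in zip(a, is_microburst) if m == flag]
        ys = [q for q, m in zip(b, is_microburst) if m == flag]
        return pearson(xs, ys)

    return {
        "pooled": pearson(a, b),
        "microburst": within(True),
        "null": within(False),
    }


# Hypothetical probability judgments by two forecasters on six occasions.
forecaster_a = [0.2, 0.4, 0.6, 0.5, 0.3, 0.4]
forecaster_b = [0.3, 0.5, 0.7, 0.3, 0.4, 0.5]
is_microburst = [True, True, True, False, False, False]

result = split_correlations(forecaster_a, forecaster_b, is_microburst)
for label, r in result.items():
    print(label, round(r, 2))
```

In this invented example the two forecasters agree closely on the microburst cases and disagree on the null cases, so the pooled value alone would hide where the agreement comes from.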


Conclusions  and  Implications 

The most important and clear-cut finding from these preliminary
analyses is a pervasive lack of agreement among the forecasters' judgments
of precursor values. Although in many cases agreement is moderate, it is
important to note, as we have previously indicated, that the level of
measurement at any stage in the judgment process (see Figure 1) sets the
upper limit for accuracy at the final stage of microburst prediction.

We have demonstrated how the analysis of the cognitive aspects of
forecasting can help delineate the judgment process and potential sources
of error. Continued application of this approach would be helpful for
improving agreement (and possibly accuracy). A first step may be to use
our hierarchical approach in decomposing the precursor judgments into their
components, much as we have in Figure 1. That is, one would want to
delineate what features in the radar data are cues for each precursor and
how those features are combined. Such an exercise would prove useful for




Appendix

Example  Protocols  for  Study  VI 

Subject 1

Case  1 

S: Okay where are we? The number of the next volume, 4. Ah what's
happening? Uh very weak divergent flow at the surface, very weak, only
three meters per second. And we've got about 55. It's 55. Very weak.
Huh again we're we see at these 55, we get divergence again above. See it
really looks like we're getting a little, it's diverging out above cold
air, but it's weak. And it gone, oh wow. We get some actually 60 this
time, reflectivity. A lot more reflectivity. And actually we're showing a
little convergence now. Oh wow it's up to 60 now. But velocity feature
not very strong, slight. Still 60, no good velocity feature. I'm not wild
about the angle we're getting now. If there were convergence in that core
we wouldn't see it well. Now at 55, I'll call it now, it's just only a
touch of 60. Slight indication of that notch is at this level, now. This
is 15 6 [pause] there's xxxx convergence into that too, hmm. Nice notch
now, reflectivity 55. Can't see an obvious velocity feature with it
though. Here's where we get the convergence. 45, 45 convergence. Okay
we've lost a lot of reflectivity now. And we, now we're actually
divergence. It's slipping down into the about 45, maybe 40, at 30 degrees.
Oh it's gone only 25 left so we have a real collapsing case here. Boy that
was faster wasn't it.

E:  yeah 

S: I had to move. Just trying to see xxx [silence] The top's coming
down. Okay now uh descending reflectivity core, yeah it's still, it's not
one of the obvious, the most obvious cases in the world, but it's still
descending. I'll put a 7, confidence is only about 50 percent. Collapsing
storm, it is collapsing but it's not the most obvious one you ever saw. So
I'll put 7, confidence at 60 percent. Organized convergence above cloud
base, yes it's still there. It's still, and it's actually descended
slightly with time I see. Not much. It's still, it's still primarily in
the three to four kilometer zone which is a good zone for it. It's not
that strong and organized. I put confidence only at 60, meaning I don't
think it's all that significant. Organized, there's still a divergence
below cloud base, and I really think that may be significant. Um I'm going
to put, I'm circling the one and two, saying, I'm putting 70 on it cause I
think the outflow is really divergent above the cold air. It may not make
it to the surface very strong. Good reflectivity notch now between 2 and 3
kilometers. I'll put a 9 on it, confidence is, well it's there, 90.
Rotation was um not as good. It was weak. Last time I think I had weak. I
xxx put down a 6. Um confidence is only 50 percent. Okay now if we're
going to have a microburst that's going to occur in this period, I'm not
very, I think it's only going to be a very weak outflow though cause the
reasons I've given. Last time I gave 25 percent. I'll go with 30 and hope
I'm right.


Subject  2 
Case  2 


S: [Silence] Okay max reflectivity here is 55. Still got weak
convergence delta V is 3, okay. [Pause] 55 again, two point two. Weak
convergence again. Okay, xxx don't see it this time. 4 and a half
degrees, 55. Um still convergent weakly delta V is 3, okay. [Pause] 6.7
xxx 55. [Silence] Um not much going on that's really different, okay. 8.8
is 55. 55 [pause] hmm. A suggestion of xxx divergence on the north edge
of cell, delta V is about 3. It's still pretty weak, okay. [Silence] 50
DBZ, 11 degrees. Got that wind change xxx, okay. [Pause] 50 DBZ again.
[Silence] Okay. [Silence] Well that's interesting, huh. 50 DBZ,
??erosion?? echo in the back. Notch is still there. It's kind of filling
in though, there's mid-line with more echo to the west of the cell than
there has been previously. [Silence] Cyclonic, anti-cyclonic couplet
there. Um okay [silence] 50 DBZ, this storm really is tilted in height.
Sort of see convergence xxxx weak xxxxx [silence] okay. [Silence] okay xxx
DBZ [silence] There's some shear areas but nothing really significant.
This is 22 degrees, um [pause] okay. [Silence] 45, again we've gotten a
couple of shear areas. Cyclonic, anti-cyclonic shear not real couples to
speak of [pause] okay. [Silence] xxx xxxx [pause] cyclonic, anti-cyclonic
shear okay. [Pause] The cell's falling apart xxx. 35 DBZ. There's
convergence ??in the anvil??, [mumbles] 6. [silence]

S: Uh reflectivities are still maintaining themselves pretty well.
[Silence] Slightly increasing aloft and then decreasing at the very highest
angle. So we don't have a descending core. And it's not collapsing.
There's no real convergence above cloud base, except in the xxxx.
[Silence] Um [silence] there's not, there's convergence at or below cloud
base, xxxx xxxx kilometer, there's that one little spot of divergence ??at
one kilometer?? It's really weak though. [Silence] The notch has become
weaker. Not as well defined. And there's also xxx flow xxx so I'm going
to rotation, no there's some cyclonic shear and that's it. Probability of
a microburst within the next 5 to 10 minutes. I'm still going to stick with
the 50 percent.


Subject  ^ 

Case  1 

S: xxx you look at that point 5 degree velocity and there's nothing
there. There is not a microburst outflow. There's some garbage right
there, but that's not real. And uh looking all the way up at 2.2 we don't
really see any divergence or velocity structure. And we've got the high
reflectivity xxx so unless we see some dramatic increases in velocity
structure, which we don't really see here at 4.5, it's going to be awful
hard to say yes we're going to get something. And uh even at 6.7 we're not
seeing any good strong velocity features associated with that core. [Long
Silence] slight hint that there may be convergence coming in here that we
can't see associated with that notch. And xxx interesting to look at it
from a from a radar out here where we could get a better view. Still
seeing that notch, but again, as I say, it's not that good of a velocity
structure. I did see some sign of convergence xxx. [Silence] Saw some,
xxx [silence] xxx rotation xxx some convergence not really that good.
[Silence] xxx looks about the same as it was before [silence] okay.

S: xxx other sheet xxx put down thing xxx for can't remember for
sure. ??We do have?? some descent of core. The storm has collapsed
already. I think there's a slight xxx still kind of collapsing. Uh xxx
not really much happening above cloud base xxx. xxx Part of why you think
collapsing storm. Slight indication of xxx. We got an indication xxx
notch. Well nothing happened last time. Still not seeing it, we've got
the high reflectivity down so, not willing to say no chance anymore, but uh
got to start backing off a little bit on that probability. I'll be a
little less convinced that something's going to happen.


Subject 5
Case  2 


S: Okay. It takes it forever. Oh we're going to start with point 5.
[Silence] Oh yeah, this guy's racing off to the north, and 55 DBZ core,
[pause] And a little convergent shear line still with us way off to the
south. Oh that's what happened to the cell. It moved off of its
convergent line. Now it's lost its low level support. It's going to
crash, okay. [Silence] Oh that's why the core crashed down in such a
hurry. [Silence] That's right I did see a sizeable increase in
reflectivity. And that's what happened to it. Okay.

E:  Is  that  an  okay  for  me? 

S: Yeah, that's an okay okay. [Silence] Oh gosh 60 DBZ. [Silence]
No velocity features at all associated with the cell at 4.5 degrees.
[Silence] Surprised it hasn't put out an outflow, okay. Wonder why not?
[Silence] Oh gee, everything's back down to 55 DBZ now. [Silence] huh.
Still no real velocity features. It's really just a flat field. Okay.
Notch on the side. [Silence] huh let's see, not much at all going on.
Strange, we're up at 8.8 degrees and I don't see much of anything, huh.
Okay, go to the next one, if you haven't already. [Silence] 50 to 55, well
a little bit of cyclonic shear. Certainly a notch. Okay. [Silence] Oh
another cyclonic shear right in the middle of the cell. [Silence] Oh yeah,
a little bit of convergence right there, okay. [Silence] Oh hurry up
[silence] yep, a little bit of convergence now in the middle of the cell.
[Silence] Okay. [Silence] Oh rotation hanging off, way off on the end out
in the area of no, not much signal. Uh now we're seeing convergence
peppered about here, hither in the thither. Rotation down in the south end
where we've always seen it. xxx okay. [Silence] Oh there's a clear
rotation near that notch, cyclonic rotation. Okay. [Silence] huh a little
bit of divergence right up here. 25.8 degrees, cyclonic shear to the
south, probably strong rotation. Huh. [Silence] Is it doing anything?
[referring to computer] [Silence] Oh yeah now I see convergence on the
western end, right where that notch, okay. [Silence] Oh there's
convergence all over the place, 34.8. Uh max reflectivities xxx 40 to 45.
Okay.

E: That's it on that one.

S:  Okay.  Descending  reflectivity  core,  it's  obvious.  Collapsing 
storm,  probably  is,  but  not  real  sure  yet.  Organized  convergence  or 
divergence  above  cloud  base,  you  betcha.  Not  much  convergence  at  or  below 
cloud  base,  I  didn't  seen  anything.  And  I'm  pretty  sure  I  didn't  see 
anything.  There's  a  reflectivity  notch.  There's  rotation.  I'm  a  little 
concerned  that  I  didn't  see  any  divergence  at  the  surface,  but  what  the 
heck.  90  percent,  or  is  this  [silence] 


Table 1

Correlations Among Judgments of Descending Core

         F1      F2      F4
F2      .14
F4     -.06     .12
F5      .10     .35    -.14

Table 2

Correlations Among Judgments of Collapsing Storm

         F1      F2      F4
F2      .69
F4      .47     .53
F5      .57     .40     .17

Table 3

Correlations Among Judgments of Convergence Above Cloud Base

         F1      F2      F4
F2      .65
F4      .71     .49
F5      .58     .53     .45

Table 4

Correlations Among Judgments of Convergence at/or Below Cloud Base

         F1      F2      F4
F2      .54
F4      .43     .76
F5      .77     .59     .45

Table 5

Correlations Among Judgments of Notch

         F1      F2      F4
F2      .38
F4      .51     .25
F5      .61     .57     .34

Table 6

Correlations Among Judgments of Rotation

         F1      F2      F4
F2      .06
F4      .12     .39
F5      .51    -.01     .26

Table 7

Correlations Among Judgments of Probability of a Microburst

         F1      F2      F4
F2      .60
F4      .88     .45
F5      .31     .15     .19


Figure  2:  Precursor  Judgment  Scales 



PLEASE LIST ON THE BACK OF THIS PAGE ANY OTHER FACTORS
YOU  CONSIDERED  WHEN  MAKING  YOUR  MICROBURST  JUDGMENT 


Appendix B

Cognitive  Aspects  of  Forecasting  the  Microburst: 
Research  Results,  Conclusions,  and  Recommendations 

Kenneth  R.  Hammond 


May  1988 


Center  for  Research  on  Judgment  and  Policy 
Institute  of  Cognitive  Science 
University  of  Colorado,  Boulder 



Dynamic  Tasks 
Kenneth  R.  Hammond 


Table of Contents

Part I: Executive Summary

Part II: Abstracts of Studies

Study I: Agreement among forecasters under a "best-case" scenario

Study II: Can storm images be constructed from precursor profiles?

Study III: Can precursor profiles be constructed from storm images?

Study IV: Effects of the 1987 field experience on stability of the
judgment process

Study V: Evaluation of the forecasting accuracy of scientifically
ignorant forecasting equations relative to the accuracy of perfectly
accurate prediction models

Study VI: A laboratory experiment: Precursor assessment by forecasters



Part  I 

Executive  Summary 


Resume  of  Results 

The role of the judgment and decision psychologists in the microburst
project has been to study the cognitive processes of research
meteorologists attempting to forecast (nowcast) microburst events. The
goal of the research is to assist in the improvement of such forecasts.

Six studies have been carried out since April 1987. Study I found that
(a) there was only modest agreement between and within forecasters in
forecasts based on error-free precursor profiles (a best-case scenario),
(b) a linear model of the judgment process was a good predictor of
forecaster judgments, and (c) it was a better predictor than the process
reported by the forecasters. Study II also found only modest agreement
among forecasters when they were asked to group together similar precursor
profiles and to construct storm images from them. Study III examined the
reverse process, constructing precursor profiles from the forecasters'
storm images, and found only moderately accurate reproductions. Study IV
found that although the 1987 Buckley field experience led to improved
agreement among forecaster judgments, agreement regarding similarity
judgments of precursor profiles remained modest. To summarize, results
from Studies I-IV indicate that the human forecasting process is far from
being a unified one and far from being a consistent process.

Because of the persistent finding of only modest agreement (i.e.,
modest inter- and intra-observer consistency) among forecasters we next
investigated the accuracy of an algorithm that capitalized on providing
entirely consistent (i.e., same data, same response) judgments under
conditions that formally simulate the forecasting situation (even though a
multivariate analysis of precursor-microburst (truth) data would not be
available). By comparing the effect of perfect consistency with perfect
model accuracy it would be possible to discover what could be gained by
improved consistency relative to improved scientific understanding.
Therefore, Study V was constructed to compare the accuracy of (a) a
scientifically ignorant forecasting equation consistently applied with (b)
a perfectly accurate conceptual model of microburst events consistently
applied under sixteen different conditions, each of which contained 100
different storms. It was found that under the most realistic assumptions
(moderate intercorrelations among precursors, moderate environmental
uncertainty) there would be little difference in accuracy between a
scientifically ignorant forecasting equation and a perfectly correct model,
both consistently applied. Thus, the results indicate that little is
likely to be gained by improving the conceptual model; more is apt to be
gained by reducing uncertainty in other steps in the forecasting process.
The question then became: Where does the uncertainty lie, and what can be
done to reduce it?



In order to answer these questions Study VI investigated forecaster
judgments based on actual Doppler radar displays of six cases. Study VI
found low to moderate agreement among forecasters in the judgments of
precursor values, particularly in judgments of "descending core," the
precursor given the most weight by the forecasters, both empirically in
Studies I-IV and in discussions. For example, agreement between
forecasters in their assessments of "descending core" ranged from -.14 to
.35 (in correlation coefficients). Two forecasters agreed very closely
(.87) over the six cases in their judgments of the likelihood of a
microburst, yet disagreed completely (-.06) in their judgments regarding
descending core. Whatever is causing these two forecasters to agree in
their judgments of microburst probability, it is not agreement on
the presence or absence of a descending core, a result that is inexplicable
from verbal explanations of the forecasting process. Therefore the answer
to the above question (Where does the uncertainty lie?) is that it lies in
precursor judgments, if nowhere else. (Caution: These results are based
only on six cases.) The answer to the question of what can be done to
reduce uncertainty is presented under "Recommendations" below.
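
The agreement figures quoted above are pairwise correlation coefficients
between forecasters' ratings of the same cases. As a minimal sketch of that
computation, assuming each forecaster's ratings are stored as an
equal-length list of numbers, the Python below computes a Pearson
correlation for every pair of forecasters. The ratings shown are
hypothetical placeholders, not data from Study VI, and the names `pearson`
and `agreement_matrix` are illustrative only.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A forecaster who gives the same rating to every case has no
        # variance, so the correlation is undefined.
        return float("nan")
    return sxy / sqrt(sxx * syy)


def agreement_matrix(ratings):
    """Correlate every pair of forecasters across the same set of cases."""
    return {
        (a, b): pearson(ratings[a], ratings[b])
        for a, b in combinations(sorted(ratings), 2)
    }


# Hypothetical ratings of one precursor over six cases (NOT study data).
ratings = {
    "F1": [1, 2, 3, 4, 5, 6],
    "F2": [2, 4, 6, 8, 10, 12],
    "F4": [6, 5, 4, 3, 2, 1],
}

agreement = agreement_matrix(ratings)
for pair, r in agreement.items():
    print(pair, round(r, 2))
```

With only six cases per pair, as in Study VI, any single coefficient
computed this way is unstable, which is the reason for the caution stated
in the text.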

Note. It is our understanding, based on conversations with Steve
Campbell (4 April 1988), that the MIT computer program for detecting the
presence and location of "descending cores" is heavily dependent upon the
declaration by one meteorologist that a descending core is or is not
present. Therefore, Steve Campbell should be advised of the results of
Study VI.


Conclusions 


1. Taken together, the results of Studies I, II, III, IV, and VI
indicate that forecasters' predictive accuracy for microburst events and
null events must be low. In particular, the results of Study VI indicated
that the perceptual assessments of precursor conditions by the forecasters
are a major source of error. The results of Study V, together with those
of Study VI, indicate that although improvement in the conceptual model is
not likely to aid matters much, improvement in the accuracy of precursor
assessment will.

2. Whatever the value of the Roberts-Wilson conceptual model might be
for understanding microbursts, the model is of little practical value for
predicting the occurrence or nonoccurrence of microbursts. (This
circumstance is not unusual; understanding and prediction are not always
closely linked.) The difficulty is almost certainly due to the demands of
information processing, rather than a lack of meteorological knowledge.
For as matters stand now, forecasters must incorporate what, by any
standard, is a great deal of changing, ambiguous information of uncertain
value without the necessary cognitive supports. They must make many
(50-100?) perceptual judgments over many different volume scans regarding
both reflectivity and velocity readings at the same time that they are
organizing data from these measurements into an overall judgment of
microburst likelihood, all of this without explicit definitions of such
important precursors as descending core, and without explicit methods or
principles for combining the data created by their perceptual judgments.
Accurate forecasts under these conditions would, in fact, be surprising.

3. The perceptual conditions of the Doppler display do not favor
accurate assessments of precursors. The perceptual environment provided by
this flat, rectangular, two-dimensional array of numerous color contours
is, on the one hand, impoverished; it does not provide the human visual
perceptual system with the rich three-dimensional display of objects in
textured space for which the visual system is so well adapted and for which
it is so effective. On the other hand, the Doppler radar is not lean
enough to provide the unambiguous "pointer-readings" of, for example, the
cockpit instrument. Furthermore, present conditions require that
perceptual judgments of both velocities and reflectivities be made
over several volume scans, a cognitive activity that makes severe demands
on memory. In short, the information display conditions are not conducive
to accurate perceptual judgments nor to the integration of perceptual data
into a scientifically based judgment.

4. If these conclusions are correct, and we see no hard evidence that
would  challenge  them,  it  is  only  reasonable  to  raise  the  question  of  how 
best  to  remedy  the  situation.  Although  the  MIT  research  team  is  currently 
attempting  to  develop  a  computer  scanning  technique  that  will  identify 
precursors,  "descending  core"  in  particular,  the  procedure,  as  noted  above, 
is  dependent  upon  the  perceptual  judgments  of  descending  cores  made  by  a 
single  meteorologist.  Given  the  results  of  Study  VI,  these  results  are  not 
likely  to  be  replicable.  Therefore,  measurement  of  precursors  remains  tied 
to  the  perceptual  judgments  of  forecasters.  That  brings  us  to  the  second 
of  the  two  questions  above:  How  can  human  precursor  judgments  be  improved? 

Recommendation  1:  Create  Rigorous,  Explicit  Definitions  of  Precursors 

We  recommend  this  procedure  for  reasons  indicated  below. 

First, recall that (a) Studies I-V show that increased refinement of
the Roberts-Wilson conceptual model offers little promise for improved
forecasting/nowcasting, and (b) the specific problem to be faced is lack of
agreement among expert research meteorologists in their perceptual
judgments of precursor events. Improvement in precursor measurement can
lead to more accurate forecasts, even if a scientifically ignorant
forecasting equation is used. Without such improvement, even a perfectly
accurate conceptual model will be of little value.



Second, note that at present there appear to be no written-out,
explicit, agreed-upon definitions of conditions that indicate the presence
of precursors. This even appears to be true of "descending core." (We use
the word "appear" because we cannot be certain that such definitions do not
exist; but we have not seen them, nor have any of the forecasters referred
us to them, and at least one meteorologist reports that there are none.)

Therefore,  if 

1.  the  descriptions  of  the  perceptual  data  measurements  and  the 
empirical  results  of  the  above  studies  (particularly  Study  VI)  are 
true, 

2.  it  is  true  that  there  is  no  set  of  explicit  definitions  or 
instructions  for  the  identification  of  precursors, 

then  we  recommend  that  steps  be  taken  to  remedy  the  omissions  described  in 
1.  and  2.  above,  and  we  offer  the  following  suggestions  regarding  the 
procedure  for  construction  of  definitions. 

The definitions should follow from the best scientific theoretical base
available. These theoretical definitions should be translated into
observables by means of a public (i.e., use of more than one expert)
critique. The procedure might well involve schematic, pictorial, and
actual Doppler radar pictorial images. Empirical tests of actual agreement
on definitions (not judgments) should be employed, rather than relying on
consensus-based verbal expressions of agreement.

Recommendation 2: Create a Formal Training Program for the Identification
and Assessment of Precursors

There  appears  to  be  no  formal  training  and  evaluation  procedure  for 
the  judgment  of  precursors.  (Again,  we  use  the  word  "appears"  for  the  same 
reasons  as  above.)  Of  course,  we  realize  that  the  forecasters  studied  here 
have  spent  perhaps  thousands  of  hours  observing  Doppler  displays. 
Nevertheless,  such  experience  by  itself  cannot  substitute  for  formal 
training  exercises  that  track  performance  and  provide  feedback. 

Recommendation 3: Carry Out a Field Performance (or Close Simulation
Thereof) Test

It cannot be taken for granted that once agreement on theoretical and
operational definitions of precursors has been established, and training
has brought perceptual judgments up to desired agreement levels,
agreement on perceptual judgments of the changing events of actual cases
will follow. Empirical tests should be used (as in Study VI) to determine
the degree of agreement under close simulation of working conditions. Such
tests are essential because it is presently impossible to establish
accuracy by comparing precursor judgments of, say, descending core with
independent, objective, readily verifiable data. Training might well be
carried out in cooperation with psychologists with experience in improving
judgments under uncertainty.

To sum up: Three recommendations are provided for ways to improve
predictions of microburst events on the basis of present knowledge afforded
by the Roberts-Wilson model and other scientifically based information.
These are (a) clarification of definitions, (b) training, and (c) simulated
field testing.


Further  Implications  of  Studies  I-VI 

1. Implications Concerning Multivariate Analysis of Relations Between
Precursors and Microbursts and Null Events

The results of Study VI hold considerable significance for the
evaluation of empirical relations between precursors and microburst and
null events. Because critically important precursors such as "descending
core" can only be identified by forecasters' perceptual judgments, and
because  these  judgments  vary  considerably  among  forecasters,  there  is  no 
possibility  of  measuring  the  empirical  relation  between  judgments  of 
precursor  events  and  microbursts  and  null  events,  as  matters  stand  now.  If 
these  theoretical  relations  cannot  be  tested  empirically  then  the  value  of 
the  enormous  amount  of  data  and  precursor  judgments  already  recorded  must 
be  called  into  question.  Apparently  the  only  empirical  multivariate 
analysis  of  physical  events  that  can  be  done  would  involve  completely 
objective  data  (velocities,  reflectivities).  This  would  be  a  huge 
undertaking  with  results  of  doubtful  utility  because  of  the  size  of  the 
data  set.  But  this  study  should  be  given  serious  consideration. 

2.  Implications  for  Training  Other  Meteorologists 

If there is little agreement among the NCAR meteorologists in their
judgments of precursors, then training of other meteorologists will depend
heavily on which meteorologist is the trainer. We assume that this
circumstance is undesirable.

3.  Implications  for  Future  Research 

Apparently Study VI is one of the first attempts to carry out a study
of forecasting using experimental, laboratory-type methods. Therefore we
point out that the existence of considerable amounts of Doppler radar tape
means that it is now possible to conduct true experiments that will permit
the examination of (a) the relative efficacy of various forecasting
methods, (b) the relative utility of various aids for forecasters, and (c)
the relative advantages of various display methods and equipment (e.g., the
three-dimensional display) as well as (d) the cognitive aspects of
forecasting. In short, the new ability to carry out experiments may offer
meteorologists new opportunities, particularly if experiments similar to
those in Studies VI and VII can be carried out prior to field studies.
Such experiments would be highly cost effective relative to field tests
such as those conducted during 1987.



Part  II 

Abstracts  of  Studies 

Study  I:  Agreement  Among  Forecasters  Under  a  "Best-Case"  Scenario 

In Study I the meteorologists judged the likelihood of microbursts
after observing precursor profiles of storm cells. The precursor profiles
provided perfectly reliable observations because the forecasters were
provided with exact precursor cue values rather than having to measure the
precursor values perceptually. Thus Study I provided a "best-case"
scenario. (It is a best-case scenario because if the forecasters were
required to judge the values of the precursors, at least some error would
be introduced in the forecasting process, thus lowering the agreement among
the forecasters. Additionally, every forecaster thus saw exactly the same
precursor values.) Results indicated that the forecasters' judgment process
is adequately represented by a linear model. A nonlinear model of the
forecasters' cognitive process, which the forecasters claim to use, failed
to reproduce the forecasts as well as the linear model. It is important to
note that only modest agreement was found among the forecasters' judgments
in this "best-case" scenario.
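
To make concrete what "represented by a linear model" means here, the
following Python is a minimal sketch of fitting a linear judgment model to
one forecaster: the forecaster's probability judgments are regressed on the
precursor cue values by ordinary least squares. This is an illustration
under stated assumptions, not the analysis actually run in Study I. The cue
profiles and judgments are hypothetical, and the function names
`fit_linear_policy` and `predict` are ours.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with
    partial pivoting. Raises ZeroDivisionError if a is singular."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_linear_policy(cues, judgments):
    """Least-squares weights [intercept, w1, w2, ...] for one forecaster,
    obtained from the normal equations (X'X) w = X'y."""
    rows = [[1.0] + list(profile) for profile in cues]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, judgments)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, profile):
    """Judgment the fitted linear model assigns to one precursor profile."""
    return weights[0] + sum(w * c for w, c in zip(weights[1:], profile))


# Hypothetical profiles: (descending core, rotation) cue values, with
# judgments generated as 10 + 20*core + 5*rotation (NOT study data).
cues = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
judgments = [10, 30, 15, 35, 55, 40]

weights = fit_linear_policy(cues, judgments)
print([round(w, 3) for w in weights])
```

How well such a model "predicts forecaster judgments" can then be assessed
by correlating `predict(weights, profile)` with the judgments the
forecaster actually gave, ideally on profiles not used in fitting.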



Study  II:  Can  Storm  Images  Be  Constructed  from  Precursor  Profiles? 

Study II tested the meaningfulness of Study I by investigating whether
the precursor profiles readily evoked pictorial images of storms. This was
done by asking the meteorologists to sort the precursor profiles into
similar categories and then to draw pictures of the storms the categories
represented. Results indicated that images were readily evoked by the
precursor profiles, thus confirming the meaningfulness of the profiles used
in Study I. In addition, the forecasters' sortings provided independent
confirmation of the linear model of information integration found in Study
I (i.e., the linear model for each forecaster predicted his/her sorting of
the profiles). Agreement among forecasters with regard to the sorting of
profiles was found to be modest, however, thus suggesting that the same
error-free precursor values may give rise to different storm images for
different forecasters.



Study  III:  Can  Precursor  Profiles  Be  Constructed  from  Storm  Images? 

In Study II we asked whether storm images could be constructed from
precursor profiles grouped by each forecaster. In Study III we
investigated the reverse process: can forecasters construct precursor
values from an examination of storm images? To what extent will
forecasters agree on the precursor values when they observe the same storm
images (of both natural and radar form)? Forecasters readily constructed
precursor profiles, but some precursor values (descending core) were more
accurately predicted than others. Agreement among forecasters with regard
to precursor values based on their observation of the storm drawings was
only modest.



Study IV: Effects of the 1987 Field Experience on Stability
of the Judgment Process

The effect of field experience on the persistence of the linear model
as a representation of the judgment process and agreement among
meteorologists was investigated by asking meteorologists once again to sort
the precursor profiles used in Study I into meaningful categories. The
linear model again was found to predict the sorting, thus confirming the
results obtained in Studies I and II. Agreement among the meteorologists
was found to be somewhat higher after the field experience, although
considerable disagreement remained.



Study V: Evaluation of the Forecasting Accuracy of Scientifically
Ignorant  Forecasting  Equations  Relative  to  the  Accuracy 
of  Perfectly  Accurate  Prediction  Models 

Because the ultimate aim of the microburst research project is to
develop an algorithm for nowcasting microbursts, we investigated the
question of whether it is necessary to develop a sophisticated,
scientifically informed algorithm to do that. Could a simple,
scientifically and empirically ignorant forecasting equation be used
instead? To answer this question we created sixteen sets of 100 different
storms each. The 16 sets were created from (a) two different storm models
(one complex, one simple), (b) two different levels of intercorrelation
(one zero, one moderately high), and (c) four levels of uncertainty. Two
different forecasting equations (one complex, one simple) were applied to
the 1600 storms thus developed and the relative accuracy of each was
evaluated quantitatively.

The results were consistent with those from previous research. The
simple, incorrect, additive, equal-weight forecasting equation was as
accurate in terms of both hit rates and correlation coefficients as the
complex, correct algorithm when (a) precursors were intercorrelated and (b)
uncertainty was at least moderate. This result argues that a simple
forecasting equation should be tested with actual microburst data that
include ground truth as soon as possible. Should that test confirm the
simulation test (as we believe it will), plans should be made to test such
an equation in the field in 1988. Perhaps a simple, low cost algorithm
that will meet FAA standards of accuracy is already available. If so,
considerable time, energy, and money might be saved.
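
The logic of this comparison can be sketched in a few lines of Python. The
sketch below is a simplified stand-in for the Study V design, not a
reproduction of it: it assumes a purely linear "true" storm model with
unequal weights, induces intercorrelation among precursors through a shared
random factor, and adds Gaussian noise to represent environmental
uncertainty. The weights, the `shared` and `noise_sd` parameters, and the
function name `simulate` are all hypothetical choices made for
illustration.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simulate(n_storms=100, weights=(0.5, 0.3, 0.15, 0.05),
             shared=0.5, noise_sd=1.0, seed=0):
    """Return (r_correct, r_equal): the correlation with the simulated
    outcome of forecasts that use the true weights and of forecasts that
    weight every precursor equally.

    shared   -- 0 gives independent precursors; larger values (up to 1)
                make them more intercorrelated via a common factor.
    noise_sd -- standard deviation of the environmental uncertainty.
    """
    rng = random.Random(seed)
    outcome, correct, equal = [], [], []
    for _ in range(n_storms):
        common = rng.gauss(0.0, 1.0)
        cues = [shared * common + (1.0 - shared) * rng.gauss(0.0, 1.0)
                for _ in weights]
        signal = sum(w * c for w, c in zip(weights, cues))
        outcome.append(signal + rng.gauss(0.0, noise_sd))
        correct.append(signal)                 # perfectly correct model
        equal.append(sum(cues) / len(cues))    # equal-weight equation
    return pearson(correct, outcome), pearson(equal, outcome)


# Compare the two equations across a few assumed uncertainty levels.
for noise_sd in (0.0, 0.5, 1.0, 2.0):
    r_correct, r_equal = simulate(noise_sd=noise_sd)
    print(noise_sd, round(r_correct, 2), round(r_equal, 2))
```

Varying `shared` and `noise_sd` over a grid gives a rough analogue of the
intercorrelation and uncertainty conditions described above; hit rates
could be added by thresholding the forecasts and the outcome.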



Study  VI:  A  Laboratory  Experiment:  Precursor  Assessment  by  Forecasters 

The  results  of  Studies  I-V  cast  doubt  on  the  efficacy  of  judgments  of 
the  occurrence  of  microbursts.  All  five  studies,  however,  suffered  from  a 
lack  of  representativeness;  that  is,  the  forecasters  were  not  presented 
with  the  actual  circumstances  in  which  microburst  forecasts  are  made. 

Study  VI  was  designed  to  remedy  that  situation.  Rod  Potts  retrieved 
microburst  and  null  cases  from  the  1987  field  study.  Thus,  accuracy  of 
forecaster  judgments  under  representative  conditions  could  be  ascertained 
for  the  first  time,  as  well  as  agreement  on  judgments  of  precursor  values. 
Present  conclusions  are  based  on  results  obtained  from  three  null  cases  and 
three  microburst  cases. 

If  inter-observer  agreement  is  taken  as  a  measure  of  intra-observer 
agreement,  then  it  is  clear  that  forecasting  accuracy  cannot  reach  even 
modest  levels,  even  if  perfect  conceptual  models  were  to  be  developed. 

Given the results of Studies I through VI, there seems to be little
justification  for  trying  to  develop  better  conceptual  models.  Results  to 
date  argue  for  focusing  first  on  ascertaining  the  ecological  validity  of 
the  descending  core  precursor;  that  is,  what  is  the  empirical  relation 
between  the  best  available  measurement  of  descending  core  and  the 
appearance  or  nonappearance  of  microbursts?  To  the  best  of  our  knowledge, 
very  little  data  are  available  for  ascertaining  the  answer  to  this  question 
(although  Rod  Potts  is  collecting  such  data).  A  more  general  question  is 
that  of  how  best  to  assist  forecasters  in  their  efforts  to  identify 
precursor  conditions  displayed  on  the  Doppler  radar  screen. 




Appendix  C 

Further  Analyses  of  Expert's  Judgments  in  a  Dynamic  Task 

Kenneth  R.  Hammond 

September  1988 

Center  for  Research  on  Judgment  and  Policy 
Institute  of  Cognitive  Science 
University  of  Colorado,  Boulder 




To investigate (a) the impact of information on judgments at specific
times  and  (b)  the  change  in  judgments  over  time,  we  graphed  each 
forecaster's  judgments  regarding  the  probability  of  a  microburst  as  a 
function of each (volume) scan. These graphs are presented in Figure A1.
Inspection of Figure A1 yields several interesting observations. First,
there are wide individual differences in forecasters' predictions. In
addition,  the  subjects  did  not  converge  on  a  similar  judgment  with  the 
accumulation  of  more  evidence;  with  a  minor  exception,  accumulation  of 
evidence  has  little  effect  on  agreement.  Second,  judgments  change  very 
little  over  time;  the  lines  joining  subsequent  judgments  are  nearly  flat. 
Means were computed for the difference between consecutive probability
judgments for each forecaster and they are as follows: .06, .08, .10, .13.
Means were also computed for the difference between the first and last
judgments in a given case for each forecaster and they are as follows:
.09, .13, .06, .24. In short, the forecasters change their probability
judgments by only about ten percentage points on average. A one-way
analysis of variance performed for each forecaster separately yielded no
statistically significant differences in probability judgments due to new
information for any of the forecasters. This is a surprising and important
finding for two
reasons:  (a)  the  forecasters  believe  that  their  judgments  are,  in  fact, 
influenced  by  incoming  information;  (b)  forecasts  may  be  as  accurate  when 
made  early  in  the  forecast  process  as  when  made  with  much  more  information 
at  a  later  time.  Taken  together,  these  conclusions  suggest  that 
forecasting  may  be  less  problematic  than  NCAR  currently  believes  it  to  be. 
Finally,  it  should  be  noted  that  the  study  would  not  have  been  carried  out, 
and,  therefore,  this  information  would  not  have  been  obtained,  without  the 
participation  of  psychologists.  One  implication  that  ARI  might  draw  from 
this  is  that  given  the  new  technological  capacity  to  run  dynamic  situations 
repeatedly  with  different  experts,  considerable  information  may  be  obtained 
about  judgment  and  decision  making  by  experts  in  dynamic  tasks  by 
researchers  in  the  field  of  judgment  and  decision  making. 
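
The two change measures reported above can be sketched as follows. The per-forecaster means are those given in the text; the example series of judgments is hypothetical, and the use of absolute (rather than signed) differences is an assumption, since the text does not say which was used.

```python
from statistics import mean


def mean_consecutive_change(judgments):
    """Mean absolute difference between successive probability judgments."""
    return mean(abs(later - earlier)
                for earlier, later in zip(judgments, judgments[1:]))


def first_last_change(judgments):
    """Absolute difference between the first and last judgment in a case."""
    return abs(judgments[-1] - judgments[0])


# Per-forecaster means as reported in the text (four forecasters).
reported_consecutive = [0.06, 0.08, 0.10, 0.13]
reported_first_last = [0.09, 0.13, 0.06, 0.24]

# Averaging across forecasters gives the "about ten percent" summary.
overall_consecutive = mean(reported_consecutive)  # about 0.09
overall_first_last = mean(reported_first_last)    # about 0.13

# Hypothetical judgments by one forecaster across five volume scans.
example_case = [0.30, 0.35, 0.30, 0.40, 0.45]
print(mean_consecutive_change(example_case), first_last_change(example_case))
```

A nearly flat series such as the hypothetical one yields small values on both measures, which is the pattern the graphs are described as showing.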




